Scalable QSF-Trees: Retrieving Regional Objects in High-Dimensional Spaces
Authors
Abstract
Many database applications require effective representation of regional objects in high-dimensional spaces. By applying an original query transformation, a recently proposed access method for regional data, called the simple QSF-tree (sQSF-tree), effectively addresses the limitations of traditional spatial access methods in spaces with many dimensions. Nevertheless, sQSF-trees are not immune to all problems associated with high data dimensionality. Based on an analysis of sQSF-trees, this paper presents a new variant, called the scalable QSF-tree (cQSF-tree), which relies on a heuristic optimization to reduce the number of false drops, that is, accesses to pages that contain no object satisfying the query. By increasing the selectivity of search predicates, cQSF-trees improve the performance of multidimensional selections. Experimental evidence shows that cQSF-trees scale better than sQSF-trees as data dimensionality grows. The performance improvements also increase as the data distribution becomes more skewed.
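The notion of a false drop can be illustrated with a minimal sketch. It is not the sQSF-tree or cQSF-tree algorithm: it assumes, purely for illustration, that each page is filtered by the bounding box of its objects, and the names `overlaps`, `bounding_box`, and `count_false_drops` are hypothetical. A page counts as a false drop when its bounding box overlaps the query region but none of its objects does.

```python
from typing import List, Sequence, Tuple

# A regional object is an axis-aligned box: (low corner, high corner).
Box = Tuple[Tuple[float, ...], Tuple[float, ...]]


def overlaps(a: Box, b: Box) -> bool:
    """True if boxes a and b intersect in every dimension."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return all(al <= bh and bl <= ah
               for al, ah, bl, bh in zip(a_lo, a_hi, b_lo, b_hi))


def bounding_box(objects: Sequence[Box]) -> Box:
    """Smallest box enclosing all objects of a page (assumed page predicate)."""
    dims = len(objects[0][0])
    low = tuple(min(o[0][d] for o in objects) for d in range(dims))
    high = tuple(max(o[1][d] for o in objects) for d in range(dims))
    return (low, high)


def count_false_drops(pages: List[List[Box]], query: Box) -> Tuple[int, int]:
    """Return (pages visited, pages visited that hold no qualifying object)."""
    visited = 0
    false_drops = 0
    for page in pages:
        if not overlaps(bounding_box(page), query):
            continue  # page is pruned by its (coarse) search predicate
        visited += 1
        if not any(overlaps(obj, query) for obj in page):
            false_drops += 1  # page was read but contributes nothing
    return visited, false_drops


# Illustrative 2-D data (hypothetical, not from the paper).
pages = [
    [((0.0, 0.0), (1.0, 1.0)), ((4.0, 4.0), (5.0, 5.0))],  # box covers the query, objects miss it
    [((2.0, 2.0), (2.5, 2.5))],                            # genuine hit
    [((8.0, 8.0), (9.0, 9.0))],                            # pruned without a visit
]
query = ((2.0, 2.0), (3.0, 3.0))
result = count_false_drops(pages, query)
```

In this toy setting the first page is visited only because its coarse predicate covers the empty space between its two objects; a more selective search predicate would prune it, which is the kind of improvement the abstract attributes to cQSF-trees.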
Similar resources
Indexing Regional Objects in High-Dimensional Spaces
Many spatial access methods, such as the R-tree, have been designed to support spatial search operators (e.g., overlap, containment, and enclosure) over both points and regional objects in multi-dimensional spaces. Unfortunately, contemporary spatial access methods are limited by many problems that significantly degrade the query performance in high-dimensional spaces. This chapter reviews the ...
Clustering Large Datasets in Arbitrary Metric Spaces
Clustering partitions a collection of objects into groups called clusters, such that similar objects fall into the same group. Similarity between objects is defined by a distance function satisfying the triangle inequality; this distance function along with the collection of objects describes a distance space. In a distance space, the only operation possible on data objects is the computation o...
A Scalable Framework for Information Visualization
This paper describes major concepts of a scalable information visualization framework. We assume that the exploration of heterogeneous information spaces at arbitrary levels of detail requires a suitable preprocessing of information quantities, the combination of different graphical interfaces and the illustration of the frame of reference of given information sets. The innovative features of ou...
Approximate Algorithms for Distance-Based Queries in High-Dimensional Data Spaces Using R-Trees
In modern database applications the similarity or dissimilarity of complex objects is examined by performing distance-based queries (DBQs) on data of high dimensionality. The R-tree and its variations are commonly cited multidimensional access methods that can be used for answering such queries. Although the related algorithms work well for low-dimensional data spaces, their performance degrad...
Exploiting random projections and sparsity with random forests and gradient boosting methods - Application to multi-label and multi-output learning, random forest model compression and leveraging input sparsity
Within machine learning, the supervised learning field aims at modeling the input-output relationship of a system from past observations of its behavior. Decision trees characterize the input-output relationship through a series of nested if-then-else questions, the testing nodes, leading to a set of predictions, the leaf nodes. Several such trees are often combined together for state-of-...
Journal title:
- J. Database Manag.
Volume 15, Issue
Pages -
Publication date 2004